Kepler Fourier Concepts: the performance of the Kepler data pipeline

Given the extreme precision attainable with the Kepler Space Telescope, the mitigation of instrumental artefacts is very important. In an earlier paper (Murphy 2012), the characteristics of Kepler data were discussed in light of their effect on asteroseismology. We continue this discussion now that data processed with the new PDC-MAP pipeline are publicly available; users should use the latest data reductions available at the Mikulski Archive for Space Telescopes (MAST), not just for PDC, but also for improvements in the attached meta-data. We discuss the injection of noise in the frequency range 0-24 d⁻¹ (up to ~277 μHz) by the PDC-LS pipeline into ~15 per cent of light-curves.

1 Introduction

The exquisite precision and long time-base of Kepler data make for excellent demonstrations of some fundamental concepts particular to Fourier transforms. Many of these were presented in Murphy (2012, hereafter Paper I), which addressed characteristics of data of different sampling rates, namely of the Kepler short- and long-cadence data, with 58.9-s and 29.4-min exposures, respectively.

Paper I demonstrated some basic Fourier concepts as applied to Kepler data, such as the importance of having short-cadence (SC) data for investigating δ Sct stars because of the aliasing generated by the low long-cadence (LC) Nyquist frequency, and also how the study of flares, for instance, can benefit from the increased time-resolution (see, e.g. Balona 2012). Furthermore, Paper I showed how SC data allow for a more precise determination of pulsation frequencies, amplitudes and phases, but do not offer greater frequency resolution, and how the observed pulsation amplitudes in LC data suffer an amplitude-reduction effect due to under-sampling. Also presented therein was the performance of the Pre-search Data Conditioning (PDC) pipeline, whose job it is to remove instrumental systematic signatures whilst preserving the astrophysics, compared to the Simple Aperture Photometry (SAP) data, which undergo only basic calibration. It was shown that the PDC data of Data Release 11 and earlier showed much lower noise than their SAP counterparts, but that spurious low-frequency peaks were evident in the LC PDC data in non-pulsating stars.
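
The Nyquist frequencies and the amplitude reduction mentioned above can be illustrated numerically. The following is a minimal sketch, not the calculation from Paper I: it assumes the sampling interval equals the exposure length quoted in the text, and it models the amplitude reduction as the averaging of a sinusoid over one exposure, which gives a sinc-shaped attenuation factor. The function names are illustrative only.

```python
import numpy as np

SECONDS_PER_DAY = 86400.0

# Exposure lengths quoted in the text; treating them as the sampling
# interval as well is an assumption of this sketch.
SC_INTERVAL_S = 58.9
LC_INTERVAL_S = 29.4 * 60.0


def nyquist_per_day(sampling_interval_s):
    """Nyquist frequency (cycles per day) for evenly sampled data."""
    return SECONDS_PER_DAY / (2.0 * sampling_interval_s)


def amplitude_attenuation(freq_per_day, exposure_s):
    """Factor by which averaging over one exposure reduces a sinusoid's amplitude.

    np.sinc(x) is the normalised sinc function, sin(pi x) / (pi x).
    """
    return np.abs(np.sinc(freq_per_day * exposure_s / SECONDS_PER_DAY))


if __name__ == "__main__":
    print("LC Nyquist frequency (d^-1):", nyquist_per_day(LC_INTERVAL_S))
    print("SC Nyquist frequency (d^-1):", nyquist_per_day(SC_INTERVAL_S))
    # Compare the attenuation of a 20 d^-1 pulsation in LC and SC data.
    print("LC attenuation at 20 d^-1:", amplitude_attenuation(20.0, LC_INTERVAL_S))
    print("SC attenuation at 20 d^-1:", amplitude_attenuation(20.0, SC_INTERVAL_S))
```

Under these assumptions a pulsation near the LC Nyquist frequency loses roughly a quarter of its amplitude in LC data, whereas the same pulsation is essentially unattenuated in SC data.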

This article offers a continuation from Paper I in which we further address a newly-discovered and characterised form of noise injection in the older version of the PDC pipeline, which used least-squares algorithms to process the data, and has thus been renamed PDC-LS to alleviate confusion; this characterisation is presented in §2. An early performance review of the newer, Maximum A Posteriori PDC pipeline (PDC-MAP) is presented in §3. We discuss the benefits of continuous Kepler coverage in §4. The data presented in this paper pertain to Data Release 11 (or earlier) for PDC-LS, and to Data Release 14 for PDC-MAP.



2 Noise injection by PDC-LS 

At the time of writing, all Kepler quarters have now been reprocessed with the newer PDC-MAP pipeline and are available on MAST, but PDC-MAP does not yet treat SC data, meaning that SC data are only available in the least-squares format. For further reading on the pipelines, the papers of Stumpe et al. (2012) and Smith et al. (2012) are recommended, along with the Kepler Data Characteristics Handbook. In this section we provide examples of how the PDC-LS pipeline injects noise into the data, how to detect it, and the on-going efforts to handle the problem.

We demonstrate the way in which PDC noise injection can distort light curves in Fig. 1, and how this affects the Fourier transform in Fig. 2. One of the characteristic features of the injected noise is the drop-off in power around 24 d⁻¹. Specifically, in a logarithmic plot of the power spectrum an extremely rapid reduction in power is seen (Fig. 3), and this was used to identify stars affected by the noise injection. Indeed, this power-reduction was observed in 15 per cent of the stars analysed, namely those with SC data available and Kepler Input Catalogue (KIC) temperatures between 5500 and 9500 K. The power-reduction is hard to detect in LC due to the low Nyquist frequency, but we stress that the LC data do show the injected noise.
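
One way to turn the drop-off described above into a simple numerical flag is to compare the median power just below and just above 24 d⁻¹. The sketch below is an illustration on synthetic, evenly sampled data, not the procedure used to classify the stars in this paper; the band width, noise levels and function names are all assumptions made for the example.

```python
import numpy as np


def power_spectrum(flux, dt_days):
    """Power spectrum of an evenly sampled series after mean subtraction."""
    flux = np.asarray(flux, dtype=float)
    spec = np.fft.rfft(flux - flux.mean())
    freqs = np.fft.rfftfreq(flux.size, d=dt_days)  # cycles per day
    power = (2.0 * np.abs(spec) / flux.size) ** 2
    return freqs, power


def drop_ratio(freqs, power, cutoff=24.0, width=10.0):
    """Median power just below the cutoff divided by median power just above it."""
    below = (freqs > cutoff - width) & (freqs <= cutoff)
    above = (freqs > cutoff) & (freqs <= cutoff + width)
    return np.median(power[below]) / np.median(power[above])


def band_limited_noise(rng, n, dt_days, cutoff, sigma):
    """White noise with every Fourier component above `cutoff` set to zero."""
    spec = np.fft.rfft(rng.normal(0.0, sigma, n))
    spec[np.fft.rfftfreq(n, d=dt_days) > cutoff] = 0.0
    return np.fft.irfft(spec, n)


def demo(seed=0, n=20000, dt_days=58.9 / 86400.0):
    """Return the drop ratio for a clean and an 'affected' synthetic light curve."""
    rng = np.random.default_rng(seed)
    clean = rng.normal(0.0, 1.0, n)
    # Hypothetical injected noise: confined below 24 d^-1, much stronger than the white noise.
    affected = clean + band_limited_noise(rng, n, dt_days, 24.0, 20.0)
    clean_ratio = drop_ratio(*power_spectrum(clean, dt_days))
    affected_ratio = drop_ratio(*power_spectrum(affected, dt_days))
    return clean_ratio, affected_ratio


if __name__ == "__main__":
    print(demo())
```

A ratio near unity indicates no step in the noise level, while a ratio of many tens or more indicates a sharp power reduction at the cutoff. Real Kepler data contain gaps and astrophysical signal, so a practical version would need pre-whitening and a periodogram suited to uneven sampling.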

Fig. 1 Three-day segments of the light curves for the detached eclipsing binary system KIC 11285625 in SC PDC flux. In Q6.1 the pipeline is injecting noise, distorting the light-curve, whereas the Q6.2 light-curve looks comparatively very clean. The corresponding Fourier transforms are shown in Fig. 2. It is well-known that PDC-LS does not treat binaries well, but the noise injection is not limited to eclipsing binary systems.

Fig. 2 The prewhitened Fourier transforms for Q6.1 and Q6.2 corresponding to the light curves in Fig. 1, and typical for stars in which the noise injection is seen. Pulsations and many harmonics of the orbit have been prewhitened; the appearance of the Fourier transform below ~5 d⁻¹ is strongly affected by those harmonics that remain. Upper panel: in the affected Q6.1 data, there is an elevated 'grass level' (the amplitude of the Fourier peaks that somewhat resemble mown grass) up to a frequency of ~24 d⁻¹, after which there is a sharp drop-off in power. Lower panel: the Q6.2 data do not suffer the noise injection. The sharp power-reduction is not noticeable in LC data (not shown) because of the low Nyquist frequency of 24.4 d⁻¹.

Fig. 3 The injected noise is easily identified by a rapid power reduction at 24 d⁻¹ in the Fourier transform, plotted here in log-log space. The example given is KIC 3429637 Q7 SC data, after pre-whitening, that is, fitting and removing the statistically significant (signal-to-noise > 4) sine curves from this δ Sct star.

Fig. 4 Linear Fourier transforms for KIC 3429637, showing the extent of the injected noise: the grass level is ~15 μmag in the PDC data, compared to just 2 μmag in the SAP data. The decrease in power at ~24 d⁻¹ is clear. The effect is present in Q7 in both SC and LC PDC data for this star, but not present at all in Q8.

Fig. 4 shows how for the star KIC 3429637 the PDC-LS pipeline injects noise into the SC Q7 data. The noise is restricted solely to PDC data and affects Q7 but not Q8 for this star. The grass level (defined in the caption of Fig. 2) is higher in PDC Q7 LC than in SAP Q7 LC (after pre-whitening), so the effect is present in LC too, even if the low Nyquist frequency hides the power drop.
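
Pre-whitening and the grass level can both be expressed compactly. The sketch below is a simplified illustration, not the analysis code behind the figures: it removes a single sinusoid of known frequency by linear least squares (the analysis described here removes every peak with signal-to-noise > 4), and it takes the grass level to be the median Fourier amplitude in a chosen band of evenly sampled residuals. Both function names are hypothetical.

```python
import numpy as np


def prewhiten(times, flux, freq):
    """Fit and subtract one sinusoid of known frequency, plus a constant offset.

    `freq` is in cycles per unit of `times`. Returns the residuals and the
    fitted amplitude of the sinusoid.
    """
    times = np.asarray(times, dtype=float)
    flux = np.asarray(flux, dtype=float)
    arg = 2.0 * np.pi * freq * times
    # Linear model: a*sin + b*cos + c, solved by least squares.
    design = np.column_stack([np.sin(arg), np.cos(arg), np.ones_like(arg)])
    coeffs, *_ = np.linalg.lstsq(design, flux, rcond=None)
    amplitude = np.hypot(coeffs[0], coeffs[1])
    residuals = flux - design @ coeffs
    return residuals, amplitude


def grass_level(residuals, dt, f_lo, f_hi):
    """Median Fourier amplitude of evenly sampled residuals between f_lo and f_hi."""
    residuals = np.asarray(residuals, dtype=float)
    amp = 2.0 * np.abs(np.fft.rfft(residuals - residuals.mean())) / residuals.size
    freqs = np.fft.rfftfreq(residuals.size, d=dt)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return np.median(amp[band])


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dt = 58.9 / 86400.0                      # days, SC-like sampling
    t = np.arange(5000) * dt
    y = 3.0 * np.sin(2.0 * np.pi * 10.0 * t + 0.4) + rng.normal(0.0, 0.1, t.size)
    resid, amp = prewhiten(t, y, 10.0)
    print("fitted amplitude:", amp)
    print("grass level 5-20 d^-1:", grass_level(resid, dt, 5.0, 20.0))
```

Comparing the grass level of pre-whitened PDC and SAP residuals over the same band gives a single number for each, which is the kind of comparison made in Fig. 4.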

It is not known why some stars are affected while others are not, nor why, for the same star, only some quarters are affected, nor why the affected quarters differ from star to star. No correlation in the fraction of stars affected was seen with T_eff, nor was any correlation seen with sky/CCD position (Fig. 5).

Sometimes the pipeline fails to fit a light curve and passes the light-curve through without modification. In the lower panel of Fig. 2 it can be seen that although noise is not injected, low-frequency peaks, which are often caused by trends in the data, are not removed either. In this case, the light curve has perhaps emerged untouched by the pipeline.

Fig. 5 The spatial distribution of the affected stars corresponds to the spatial density of all stars across Kepler's CCDs.



The Kepler Science Office is working extremely hard to continually improve the pipeline, and every effort is made to visually inspect the light-curves at each data release.

Fig. 6 Although PDC-MAP removes many of the long-term trends satisfactorily, it has initially struggled to mitigate thermal effects. For this reason, small exponential thermal recoveries are seen at the data downlink events in PDC-MAP, just as they are in the SAP data. The example shown is KIC 7450391, a non-pulsating late A-type star, for Kepler Quarter 2; Q2 is the most extreme example, and in other Quarters PDC-MAP does much better than this.



3 Performance of PDC-MAP

At the time of writing, PDC-MAP has only very recently become available for public data. In comparing PDC-MAP with the older PDC-LS and the SAP data (Fig. 6), it is clear that PDC-MAP has strong trends (over many cadences) following the horizontal discontinuities in the data that arise from the scheduled data downlinking events. In the older pipeline these trends were fixed quite well. Such teething problems might be expected with new software, somewhat similarly to how the Copernican model of the solar system initially struggled to out-perform the Ptolemaic model because of the substantially larger amount of time invested in improving the latter. PDC-MAP identifies systematic behaviour common to many stars to create a fit to subtract from the data, but there appears to be a great variation between the response of different pixels to the thermal effects that cause the trends, and this effect is convolved with differential velocity aberration, making the light-curves particularly hard to treat.

Paper I investigated the difference between SAP and PDC-LS data in Fourier space, noting that the noise in the latter was reduced by factors of 10-100 over the former. Here, we extend that comparison to also include the PDC-MAP data (Fig. 7). It appears that although PDC-MAP is slightly noisier at very low frequency due to the remaining thermal and focussing effects, at higher frequency (above ~2 d⁻¹) it out-performs PDC-LS, and has the added reassurance that it preserves stellar variability more satisfactorily than PDC-LS. From Quarter 13 onwards, a cubic fit will also be calculated and subtracted from the data in a pipeline version called 'multi-scale' MAP. It is therein assumed that long-term trends on the scale of around one month are systematic and are thus removed.



4 The benefits of continuous Kepler coverage of δ Sct stars

Kepler SC slots are oversubscribed, such that even though some SC data are desirable (though not required, as will be demonstrated in a future publication) to resolve the LC Nyquist aliasing issues, it is mostly only feasible to continuously study a star in LC. Here we shall address the benefits of doing so.

The δ Sct stars are not particularly stable pulsators. Their amplitudes and frequencies are subject to change, and the literature contains numerous examples such as 4 CVn (Breger 2000), whose pulsation amplitudes have been observed to change unpredictably for decades. With continuous, space-based observations of such high precision, the potential for observing such changes is great. Murphy et al. (2012a) have used eight Quarters of Kepler data to investigate the pulsational amplitude growth of the ρ Pup star KIC 3429637. Such observations may prove indispensable in understanding mode interactions and the details of the driving mechanism and damping in operation in δ Sct stars.

Fig. 7 Overlaid Fourier plot for SAP, PDC-LS and PDC-MAP for KIC 7450391, Quarter 2, corresponding to the light-curves in Fig. 6. Upper panel: below ~2 d⁻¹ PDC-MAP is slightly noisier than PDC-LS, but has lower noise above that frequency. Lower panel: zoom-in on the region below 2 d⁻¹. SAP is substantially noisier, and the convergence in performance of PDC-MAP and PDC-LS can be seen as low as 0.5 d⁻¹. PDC-MAP can thus be said to be at least as good as PDC-LS on time-scales important to astrophysics.

In addition, Shibahashi & Kurtz (2012) have described a method of deriving the mass function of a binary pair using the frequency-modulation (FM) technique on Fourier data, i.e. without the need of spectroscopy. Specifically, if one has continuous LC data covering at least a whole orbit, and one of the stars pulsates (preferably at high amplitude and frequency), then by using the sidelobes present in the Fourier transform one can calculate the parameters of that orbit. So far, only three 'FM stars' are known: the prototype in Shibahashi & Kurtz (2012; KIC 4150611), a highly non-linear δ Sct star with many combination frequencies (KIC 11754974; Murphy et al. 2012b, in prep), and a third star currently under study by Kurtz & Shibahashi. Telting et al. (2012) have found a pulsating sdB in a binary for which the sidelobes were only of borderline significance and thus direct application of the FM technique was ineffective. Under the right conditions, however, it is even possible to detect Jupiter-mass planets orbiting δ Sct stars with this technique.
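
To make the FM idea concrete, the sketch below computes a mass function from the amplitudes of a pulsation peak and its first pair of sidelobes. It is an illustration under stated assumptions, not a transcription of the expressions in Shibahashi & Kurtz (2012), which should be consulted for the authoritative treatment. It assumes a circular orbit and a small phase-modulation amplitude α, so that α ≈ (A₊₁ + A₋₁)/A₀, and uses a₁ sin i = αc/(2πν_osc) in the standard mass function f = 4π²(a₁ sin i)³/(G P_orb²), which gives f = α³c³/(2πG ν_osc³ P_orb²). The function name and argument names are hypothetical.

```python
import math

G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m s^-1
M_SUN = 1.98847e30     # solar mass, kg
SECONDS_PER_DAY = 86400.0


def fm_mass_function(a_central, a_plus, a_minus, nu_osc_per_day, p_orb_days):
    """Mass function (solar masses) from first-order FM sidelobe amplitudes.

    Assumes a circular orbit and alpha << 1, so that the ratio of the summed
    sidelobe amplitudes to the central amplitude approximates alpha.
    """
    alpha = (a_plus + a_minus) / a_central
    nu_osc = nu_osc_per_day / SECONDS_PER_DAY   # Hz
    p_orb = p_orb_days * SECONDS_PER_DAY        # s
    f_kg = alpha**3 * C**3 / (2.0 * math.pi * G * nu_osc**3 * p_orb**2)
    return f_kg / M_SUN


if __name__ == "__main__":
    # Hypothetical amplitudes, pulsation frequency and orbital period.
    print(fm_mass_function(1.0, 0.01, 0.01, 20.0, 100.0))
```

Because the mass function scales as α³, small errors in the sidelobe amplitudes are amplified, which is consistent with the difficulty reported above when the sidelobes are only of borderline significance.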

Finally, continuous coverage of eclipsing binary systems, or even transiting planets, is essential to precisely determine the orbital period, and even allows detections of additional bodies through Transit Timing Variations (Ballard et al. 2011). For this reason, the news that the Kepler Space Mission is receiving funding for an extension is particularly exciting.
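
The link between continuous coverage, the orbital period and Transit Timing Variations can be sketched as a linear ephemeris fit: the slope gives the period, and the residuals (observed minus calculated times) are the timing variations. The example below is a generic illustration with a hypothetical function name, not the method of Ballard et al. (2011).

```python
import numpy as np


def transit_timing_variations(epochs, mid_times):
    """Fit a linear ephemeris t = t0 + P * E and return (period, t0, residuals)."""
    epochs = np.asarray(epochs, dtype=float)
    mid_times = np.asarray(mid_times, dtype=float)
    # np.polyfit returns the highest-order coefficient first: slope, then intercept.
    period, t0 = np.polyfit(epochs, mid_times, 1)
    residuals = mid_times - (t0 + period * epochs)
    return period, t0, residuals


if __name__ == "__main__":
    # Synthetic mid-transit times: a 3.5-d period with a small sinusoidal perturbation.
    epochs = np.arange(40)
    times = 100.0 + 3.5 * epochs + 0.002 * np.sin(2.0 * np.pi * epochs / 10.0)
    period, t0, ttv = transit_timing_variations(epochs, times)
    print("period (d):", period)
    print("TTV range (d):", ttv.max() - ttv.min())
```

Gaps in coverage remove epochs from the fit, which both weakens the period determination and makes a periodic pattern in the residuals harder to recognise.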